Linear stochastic approximation driven by slowly varying Markov chains
Authors
Abstract
We study a linear stochastic approximation algorithm that arises in the context of reinforcement learning. The algorithm employs a decreasing step-size, and is driven by Markov noise with time-varying statistics. We show that under suitable conditions, the algorithm can track the changes in the statistics of the Markov noise, as long as these changes are slower than the rate at which the step-size of the algorithm goes to zero. © 2003 Elsevier B.V. All rights reserved.
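The abstract does not give the recursion itself, so the following is only a minimal sketch of what a linear stochastic approximation iteration with a decreasing step-size and Markov noise can look like. Everything in it is an assumption chosen for illustration: the scalar update `r <- r + gamma_t * (b[x] - a[x] * r)`, the step-size `gamma_t = 1/t`, the two-state chain with stay probability 0.9, and the coefficient values. The chain here has fixed statistics, so the sketch shows convergence to a fixed point, not the tracking of slowly varying statistics that the paper analyzes.

```python
import random


def simulate_linear_sa(num_steps=20000, seed=0):
    """Scalar linear stochastic approximation driven by a two-state Markov chain.

    Assumed (illustrative) update: r <- r + gamma_t * (b[x] - a[x] * r),
    with decreasing step-size gamma_t = 1 / t.
    """
    rng = random.Random(seed)
    stay = 0.9        # hypothetical probability of remaining in the current state
    a = [1.0, 2.0]    # hypothetical per-state coefficients (positive, for stability)
    b = [1.0, 4.0]    # hypothetical per-state offsets
    x = 0             # current state of the Markov chain
    r = 0.0           # current iterate
    for t in range(1, num_steps + 1):
        gamma = 1.0 / t                  # decreasing step-size
        r += gamma * (b[x] - a[x] * r)   # linear update driven by the current state
        if rng.random() > stay:          # Markov transition: switch state w.p. 0.1
            x = 1 - x
    return r


if __name__ == "__main__":
    # The chain is symmetric, so its stationary distribution is (0.5, 0.5).
    # The averaged equation 2.5 - 1.5 * r = 0 has solution r = 5/3.
    print(simulate_linear_sa())
```

Under these assumptions the iterate should settle near 5/3, the solution of the equation obtained by averaging `a` and `b` over the stationary distribution of the chain.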
Similar resources
Approximating Queues in Slowly Varying Stationary Environments
We provide linear approximations to the marginal distributions for a class of infinite-state continuous-time stationary Markov chains in slowly varying environments. We take an approach motivated by light-traffic approximations to stationary point processes, which permits us to consider general stationary environments. Under mild assumptions we show that Jackson networks with routing not affecte...
Markov Chains Approximation of Jump-Diffusion Quantum Trajectories
“Quantum trajectories” are solutions of stochastic differential equations, also called Belavkin or stochastic Schrödinger equations. They describe random phenomena in quantum measurement theory. Two types of such equations are usually considered: one is driven by a one-dimensional Brownian motion and the other by a counting process. In this article, we present a way to obtain more adva...
On Markov Chain Approximations to Semilinear Partial Differential Equations Driven by Poisson Measure Noise
We consider the stochastic model of water pollution, which mathematically can be written with a stochastic partial differential equation driven by Poisson measure noise. We use a stochastic particle Markov chain method to produce an implementable approximate solution. Our main result is the annealed law of large numbers establishing convergence in probability of our Markov chains to the solutio...
Persistent tracking and identification of regime-switching systems with structural uncertainties: unmodeled dynamics, observation bias, and nonlinear model mismatch
This work focuses on tracking and system identification of systems with regime-switching parameters, which are modeled by a Markov process. It introduces a framework for persistent identification problems that encompass many typical system uncertainties, including parameter switching, stochastic observation disturbances, deterministic unmodeled dynamics, sensor observation bias, and nonlinear m...
Stochastic Dynamic Programming with Markov Chains for Optimal Sustainable Control of the Forest Sector with Continuous Cover Forestry
We present a stochastic dynamic programming approach with Markov chains for optimal control of the forest sector. The forest is managed via continuous cover forestry and the complete system is sustainable. Forest industry production, logistic solutions and harvest levels are optimized based on the sequentially revealed states of the markets. Adaptive full system optimization is necessary for co...
Journal title:
- Systems & Control Letters
Volume 50
Pages: -
Publication date: 2003